Extraction of Visual Features for Lipreading

Authors

  • Iain A. Matthews
  • Timothy F. Cootes
  • J. Andrew Bangham
  • Stephen J. Cox
  • Richard Harvey
Abstract

The multi-modal nature of speech is often ignored in human-computer interaction, but lip deformation and other body motion, such as head and arm motion, all convey additional information. Humans integrate speech cues from many sources, and this improves intelligibility, especially when the acoustic signal is degraded. This paper shows how this additional, often complementary, visual speech information can be used for speech recognition. Three methods for parameterising lip image sequences for recognition using hidden Markov models are compared. Two of these are top-down approaches that fit a model of the inner and outer lip contours and derive lipreading features from a principal component analysis of shape, or of shape and appearance, respectively. The third, bottom-up, method uses a nonlinear scale-space analysis to form features directly from the pixel intensity. All methods are compared on a multi-talker visual speech recognition task of isolated letters.

Author note: now at the Human-Computer Interaction Institute, Carnegie Mellon University, Pittsburgh, USA.
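The shape-based parameterisation mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes each lip contour is already available as a flat vector of landmark coordinates, uses a made-up four-point "mouth", and finds only the first principal component by power iteration rather than a full eigen-decomposition. All function names and data here are hypothetical.

```python
import math

def mean_shape(shapes):
    """Average each coordinate over all training shapes."""
    n = len(shapes)
    return [sum(s[i] for s in shapes) / n for i in range(len(shapes[0]))]

def covariance(shapes, mean):
    """Sample covariance matrix of the shape vectors (divides by n - 1)."""
    n, d = len(shapes), len(mean)
    cov = [[0.0] * d for _ in range(d)]
    for s in shapes:
        diff = [s[i] - mean[i] for i in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / (n - 1)
    return cov

def first_mode(cov, iters=100):
    """Dominant eigenvector and eigenvalue of cov, found by power iteration."""
    d = len(cov)
    start = [float(i + 1) for i in range(d)]  # arbitrary non-zero start vector
    start_norm = math.sqrt(sum(x * x for x in start))
    v = [x / start_norm for x in start]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along the current direction
            break
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, eigenvalue

def shape_feature(shape, mean, mode):
    """Project a shape onto one mode of variation to get a scalar feature."""
    return sum((shape[i] - mean[i]) * mode[i] for i in range(len(mean)))

# Hypothetical toy data: four (x, y) landmarks flattened to 8 numbers
# (left corner, upper lip, right corner, lower lip).
base = [0.0, 0.0, 1.0, 1.0, 2.0, 0.0, 1.0, -1.0]
opening = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, -1.0]  # lips move apart
shapes = [
    [p + b * o for p, o in zip(base, opening)]
    for b in (-1.0, -0.5, 0.0, 0.5, 1.0)
]

mean = mean_shape(shapes)
mode, variance = first_mode(covariance(shapes, mean))
features = [shape_feature(s, mean, mode) for s in shapes]
print(features)  # one scalar per frame; the sign of the mode is arbitrary
```

In a recogniser of the kind the abstract describes, per-frame feature vectors like these (usually several modes, not one) would form the observation sequence passed to the hidden Markov models.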

Similar resources

Improvement of lipreading performance using discriminative feature and speaker adaptation

In this paper, we apply a general and discriminative feature "GIF" (Genetic Algorithm based Informative feature) to lipreading (visual speech recognition), and improve the lipreading performance using speaker adaptation. The feature extraction method consists of two transforms, which convert an input vector into GIF for recognition. In the speaker adaptation, MAP (Maximum A Posteriori) adaptati...

Improved ROI and within frame discriminant features for lipreading

We study three aspects of designing appearance based visual features for automatic lipreading: (a) The choice of the video region of interest (ROI), on which image transform features are obtained; (b) The extraction of speech discriminant features at each frame; and (c) The use of temporal information to improve visual speech modeling. In particular, with respect to (a), we propose a ROI that i...

Comparing visual features for lipreading

For automatic lipreading, there are many competing methods for feature extraction. Often, because of the complexity of the task, these methods are tested on only quite restricted datasets, such as the letters of the alphabet or digits, and from only a few speakers. In this paper we compare some of the leading methods for lip feature extraction on the GRID dataset which uses a co...

A Multibiometric Speaker Authentication System with SVM Audio Reliability Indicator

Performances of biometric speaker authentication systems are good in clean conditions but their reliability drops severely in noisy environments. Implementation of multibiometric systems using audio and visual experts is one of the solutions to this limitation. In this study, weighting for fusing the audio and visual expert scores is proposed to be adapted corresponding to the current environme...

Speaker-Independent Visual Lip Activity Detection for Human-Computer Interaction

Recently there has been increased interest in using visual features for improved speech processing. Lip reading plays a vital role in visual speech processing. In this paper, a new approach for lip reading is presented. Visual speech recognition is applied in mobile phone applications, in human-computer interaction, and to recognize the spoken words of hearing-impaired persons. The visual spe...

Journal:
  • IEEE Trans. Pattern Anal. Mach. Intell.

Volume 24

Publication date: 2002